
Maker-Error when started with IMPI


Maker-Error when started with IMPI

Rainer Rutka

Hi!

Running maker with mpiexec causes this error message:

mpiexec_uc1n077.localdomain: cannot connect to local mpd
(/scratch/mpd2.console_uc1n077.localdomain_kn_pop235844); possible causes:
1. no mpd is running on this host
2. an mpd is running but was started without a "console" (-n option)
### Cleaning up files ... removing unnecessary scratch files ...

And yes, we don't have mpd running.

The environment used is:

Currently Loaded Modulefiles:
1) compiler/intel/16.0(default)
2) mpi/impi/5.1.3-intel-16.0(default)
3) bio/maker/2.31.8_impi

I am running maker with mpiexec using only 1 node and 8 cores:

mpiexec -n 8 maker

:-(

Any suggestions?

--
Rainer Rutka
University of Konstanz
Communication, Information, Media Centre (KIM)
* High-Performance-Computing (HPC)
* KIM-Support and -Base-Services
Room: V511
78457 Konstanz, Germany
+49 7531 88-5413




Re: Maker-Error when started with IMPI

Carson Holt-2
It means one of two things.

1. The mpiexec you called is not from Intel MPI (try 'which mpiexec' to verify the location and that it is not some other random mpiexec executable).
2. It is an extremely old version of Intel MPI (in which case the instructions I gave you would not apply)


MPD is an old launcher used in MPICH1 and early versions of MPICH2. It’s been abandoned since about 2008, when MPICH2 switched to the hydra launcher, which is still used in MPICH3.

So either you are pointing to an old version of MPICH or you have a very old version of Intel MPI based off of MPICH.
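A quick way to see which launcher flavor actually runs (a sketch; the exact flags and banners vary between MPI distributions):

which mpiexec                # should resolve inside Intel MPI's intel64/bin directory
mpiexec -V 2>&1 | head -2    # most launchers accept -V/--version; the banner reveals the flavor
ls "$(dirname "$(which mpiexec)")" | grep -i hydra    # hydra binaries shipped alongside?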

—Carson

> On Feb 23, 2017, at 6:10 AM, Rainer Rutka <[hidden email]> wrote:
> [...]


Maker-Error when started with IMPI

Rainer Rutka
Hi Carson.

First of all THANK YOU FOR YOUR HELP.
MUCH APPRECIATED. :-)

On 23.02.2017 at 21:22, Carson Holt wrote:
> It means one of two things.
> 1. The mpiexec you called is not from Intel MPI (try 'which mpiexec' to verify the location and that it is not some other random mpiexec executable).

No, I am starting the right version of mpiexec.

This is a list of all our currently available MPI versions, each paired with
its compiler:

UC:[kn@uc1n997 ~]$ module avail mpi
------------------------------------- /opt/bwhpc/common/modulefiles --------------------------------------
mpi/impi/4.1.3-gnu-4.4              mpi/openmpi/1.10-intel-15.0
mpi/impi/4.1.3-gnu-4.7              mpi/openmpi/1.10-intel-16.0
mpi/impi/4.1.3-intel-14.0           mpi/openmpi/1.8-gnu-4.5
mpi/impi/5.0.3-gnu-4.4              mpi/openmpi/1.8-gnu-4.7
mpi/impi/5.0.3-gnu-4.7              mpi/openmpi/1.8-gnu-4.7-m32
mpi/impi/5.0.3-intel-15.0           mpi/openmpi/1.8-gnu-4.8
mpi/impi/5.1.3-gnu-4.7              mpi/openmpi/1.8-gnu-4.9
mpi/impi/5.1.3-gnu-system           mpi/openmpi/1.8-intel-14.0(default)
mpi/impi/5.1.3-intel-16.0(default)  mpi/openmpi/1.8-intel-15.0
mpi/openmpi/1.10-gnu-4.5            mpi/openmpi/2.0-gnu-4.7
mpi/openmpi/1.10-gnu-4.7            mpi/openmpi/2.0-gnu-4.8
mpi/openmpi/1.10-gnu-4.8            mpi/openmpi/2.0-gnu-5.2
mpi/openmpi/1.10-gnu-4.9            mpi/openmpi/2.0-intel-15.0
mpi/openmpi/1.10-gnu-5.2            mpi/openmpi/2.0-intel-16.0
mpi/openmpi/1.10-intel-14.0


Here I load the MPI module including all its dependencies:

UC:[kn@uc1n997 ~]$ module load mpi/impi/5.1.3-intel-16.0
Loading module dependency 'compiler/intel/16.0'.

Result:
UC:[kn@c1n997 ~]$ which mpiexec
/opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec
UC:[kn@uc1n997 ~]$

> 2. It is an extremely old version of Intel MPI (in which case the instructions I gave you would not apply)
Extremely old?

[...]
Intel(R) MPI Library for Linux* OS, Version 5.1.3 Build 20160601 (build id: 15562)

> MPD is an old launcher used in MPICH1 and early versions of MPICH2. It’s been abandoned since about 2008 when MPICH2 switched to the hydra launcher which is still used in MPICH3.
Yes. And we don't even use MPD at our clusters :-)

> So either you are pointing to an old version of MPICH or you have a very old version of Intel MPI based off of MPICH.
I do not have any MPICH available on our cluster(s) now. All of the old
versions were removed some years ago.

At a glance
---------------

Maker is available as a so-called module on our cluster system. It has been
built on a developers' node I can access. However, the MPI modules are built
by other colleagues (e.g. at the KIT in Karlsruhe/Germany) on other nodes.

Please check the module file (included in this mail)

bio-maker-2.31.8_impi

to see how Maker was built, including the environments set by this
module.

e.g.:
./Build status
verify dependencies:
===================================================
STATUS MAKER v2.31.8
===================================================
PERL Dependencies: VERIFIED
External Programs: VERIFIED
External C Libraries: VERIFIED
MPI SUPPORT: ENABLED
MWAS Web Interface: DISABLED
MAKER PACKAGE: CONFIGURATION OK


Finally, please(!) have a look at our m.moab file (included in this mail).
This is how we submit a Maker job to our cluster. Maybe something
is wrong here?

Sorry again for taking up your time, but we urgently need the Maker
software running in parallel mode.

:-)

--
Rainer Rutka
University of Konstanz
Communication, Information, Media Centre (KIM)
* High-Performance-Computing (HPC)
* KIM-Support and -Base-Services
Room: V511
78457 Konstanz, Germany
+49 7531 88-5413


Attachments: bio-maker-2.31.8_impi (14K), m.moab (7K)

Re: Maker-Error when started with IMPI

Carson Holt-2
Specific things.

1. Do not set LD_PRELOAD. That is only for OpenMPI, but it will cause problems with other MPIs.

2. Make sure you recompiled MAKER for Intel MPI (MPI code always has to be compiled for the flavor you are using, so make sure you have a separate installation of MAKER for Intel MPI). Also validate that the mpicc and libmpi.h listed during the MAKER install belong to Intel MPI. Don’t just assume they do because you loaded the module. Manually verify the paths during MAKER’s setup (a quick check is sketched after this list).

3. The error you got previously should not even be possible with the current version of Intel MPI, which is why I say that when you called mpiexec, something else (that was not Intel MPI) was launched. The easy solution is to give the full path of mpiexec in your job, so you are not relying on PATH being unaltered in your job.

Do not do —>  mpiexec -nc 1 maker
Do this for example —> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec -nc maker

4. Build and run on the same node for your test. If you build on one node and run on another, you may be changing your environment in ways you don’t realize that break things. If it works when you build and test on the same node but then fails when you test it elsewhere, you have to track down how your environment is changing (the sketch below shows one way to compare environments).
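For points 2 and 4, something along these lines may help (a sketch; $MPIDIR stands for whatever root directory your MPI module exports, and the env file names are made up):

# Point 2: confirm the toolchain on PATH really belongs to Intel MPI.
type mpicc                   # the path should contain .../impi/<version>/...
mpicc -show                  # MPICH-derived wrappers print the real compile/link line
find "$MPIDIR" -name mpi.h   # the header MAKER's ./Build should pick up

# Point 4: dump the environment on each node, then diff the two files.
env | sort > "env_$(hostname).txt"
diff env_buildnode.txt env_runnode.txt    # hypothetical file names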

—Carson

> On Feb 24, 2017, at 1:43 AM, Rainer Rutka <[hidden email]> wrote:
> [...]



Re: Maker-Error when started with IMPI

Rainer Rutka
Hi Carson.

Again THANK YOU for your efforts :-)



On 24.02.2017 at 18:30, Carson Holt wrote:
> Specific things.
>
> 1. Do not set LD_PRELOAD. That is only for OpenMPI, but it will cause problems with other MPI's.
OK, I deleted this environment variable. It is not set any more.

> 2. Make sure you recompiled MAKER for Intel MPI (MPI code always has to be compiled for the flavor you are using, so make sure you have a separate installation of MAKER for Intel MPI). Also validate that the mpicc and libmpi.h listed during the MAKER install belong to Intel MPI. Don’t just assume they do because you loaded the module. Manually verify the paths during MAKER’s setup.
I validated:

UC:[kn@uc1n996 bwhpc-examples]$ module list
Currently Loaded Modulefiles:
1) compiler/intel/16.0(default)
2) mpi/impi/5.1.3-intel-16.0(default)

FOR MPICC:
UC:[kn@uc1n996 bwhpc-examples]$ type mpicc
mpicc is /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpicc

FOR LIBMPI:
UC:[kn@uc1n996 bwhpc-examples]$ echo $MPIDIR
/opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64
UC:[kn@uc1n996 bwhpc-examples]$ find $MPIDIR -name '*'mpi.h -print
/opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/include/mpi.h

Here I can find an mpi.h but not a libmpi.h. But I think this is OK,
because the software was compiled and linked without any errors or missing libs.

> 3. The error you got previously should not even be possible with the current version of Intel MPI,
> which is why I say that when you called mpiexec, something else (that was not Intel MPI) was launched.
> Easy solution is to give the full path of mpiexec in your job, so are not relying on PATH to be unaltered in your job.
mpiexec is in the PATH and the right one is/was used, too.

MPIEXEC:
UC:[kn@uc1n996 bwhpc-examples]$ type mpiexec
mpiexec is /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec
UC:[kn@bwhpc-examples]$

> Do not do —>  mpiexec -nc 1 maker
> Do this for example —> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec -nc maker
OK, so I did:

[...]
#MSUB -l nodes=1:ppn=1
#MSUB -l mem=20gb
[...]
echo " "
echo "### Runing Maker example"
echo " "
export OMPI_MCA_mpi_warn_on_fork=0
/opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec
-nc maker
[...]

> 4. Build and run on the same node for your test. If you build on one node and run on another, you may
> be changing your environment in ways you don’t realize that break things. So if you can build and test on
> the same node and it works, then it fails when you test it elsewhere, then you have to track down how your
> environment is changing.
OK I did. Same node: uc1n996


UNFORTUNATELY I GOT THE SAME ERROR:

[...]
### Running Maker example

LD_PRELOAD=/opt/bwhpc/common/mpi/openmpi/2.0.1-intel-16.0/lib/libmpi.so
OMPI_MCA_mpi_warn_on_fork=0
I_MPI_CPUINFO=proc
I_MPI_PMI_LIBRARY=/opt/bwhpc/common/mpi/openmpi/2.0.1-intel-16.0/lib/libpmi.so
I_MPI_PIN_DOMAIN=node
I_MPI_FABRICS=shm:tcp
I_MPI_HYDRA_IFACE=ib0
mpiexec_uc1n342.localdomain: cannot connect to local mpd
(/scratch/mpd2.console_uc1n342.localdomain_kn_pop235844); possible causes:
1. no mpd is running on this host
2. an mpd is running but was started without a "console" (-n option)
[...]


> —Carson

tbc. ? :-)

THANX

--
Rainer Rutka
University of Konstanz
Communication, Information, Media Centre (KIM)
* KIM Training
* Scientific Computing/bwHPC-C5
* KIM Base Services, KIM Support
Room: V511
78457 Konstanz, Germany
+49 7531 88-5413




Re: Maker-Error when started with IMPI : CORRECTED MAIL : SEE THIS ONE

Rainer Rutka

Sorry, sent wrong e-mail :-(

IGNORE THE FIRST MAIL I SENT!

On 01.03.2017 at 13:30, Rainer Rutka wrote:
Hi Carson.
 
Again THANK YOU for your efforts :-)
 
 
On 24.02.2017 at 18:30, Carson Holt wrote:
> Specific things.
>
> 1. Do not set LD_PRELOAD. That is only for OpenMPI, but it will cause
> problems with other MPI's.

OK, I deleted this environment variable. It is not set any more.
 
> 2. Make sure you recompiled MAKER for Intel MPI (MPI code always has
> to be compiled for the flavor you are using, so make sure you have a
> separate installation of MAKER for Intel MPI). Also validate that the
> mpicc and libmpi.h listed during the MAKER install belong to Intel
> MPI. Don’t just assume they do because you loaded the module. Manually
> verify the paths during MAKER’s setup.

I validated:
 
UC:[kn@uc1n996 bwhpc-examples]$ module list
Currently Loaded Modulefiles:
1) compiler/intel/16.0(default)
2) mpi/impi/5.1.3-intel-16.0(default)
 
FOR MPICC:
UC:[kn@uc1n996 bwhpc-examples]$ type mpicc
mpicc is /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpicc
 
 
FOR LIBMPI:
UC:[kn@uc1n996 bwhpc-examples]$ echo $MPIDIR
/opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64
UC:[kn@uc1n996 bwhpc-examples]$ find $MPIDIR -name '*'mpi.h -print
/opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/include/mpi.h
 
 
Here I can find an mpi.h but not a libmpi.h. But I think this is OK,
because the software was compiled and linked without any errors or missing libs.
 
> 3. The error you got previously should not even be possible with the
> current version of Intel MPI,
> which is why I say that when you called mpiexec, something else (that
> was not Intel MPI) was launched.
> Easy solution is to give the full path of mpiexec in your job, so are
> not relying on PATH to be unaltered in your job.

mpiexec is in the PATH and the right one is/was used, too:
 
MPIEXEC:
UC:[kn@uc1n996 bwhpc-examples]$ type mpiexec
mpiexec is /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec

> Do not do —>  mpiexec -nc 1 maker
> Do this for example —>
> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec
> -nc maker
 
OK, so I did:
 
[...]
#MSUB -l nodes=1:ppn=1
#MSUB -l mem=20gb
[...]
echo " "
echo "### Runing Maker example"
echo " "
export OMPI_MCA_mpi_warn_on_fork=0
/opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec
-nc maker
[...]
 
> 4. Build and run on the same node for your test. If you build on one
> node and run on another, you may
> be changing your environment in ways you don’t realize that break
> things. So if you can build and test on
> the same node and it works, then it fails when you test it elsewhere,
> then you have to track down how your
> environment is changing.

OK I did. Same node: uc1n996
 
 
UNFORTUNATELY I GOT THE SAME ERROR:
 
  [...]
Currently Loaded Modulefiles:
   1) compiler/intel/16.0(default)
   2) mpi/impi/5.1.3-intel-16.0(default)
   3) bio/maker/2.31.8_impi


### Display internal Maker/bwHPC environments...

MAKER_BIN_DIR  = /opt/bwhpc/common/bio/maker/2.31.8_impi/bin
MAKER_EXA_DIR  = /opt/bwhpc/common/bio/maker/2.31.8_impi/bwhpc-examples


### Running Maker example
OMPI_MCA_mpi_warn_on_fork=0
I_MPI_CPUINFO=proc
I_MPI_PMI_LIBRARY=/opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/lib/libmpi.so
I_MPI_PIN_DOMAIN=node
I_MPI_FABRICS=shm:tcp
I_MPI_HYDRA_IFACE=ib0
mpiexec_uc1n326.localdomain: cannot connect to local mpd (/scratch/mpd2.console_uc1n326.localdomain_kn_pop235844); possible causes:
   1. no mpd is running on this host
   2. an mpd is running but was started without a "console" (-n option)
### Cleaning up files ... removing unnecessary scratch files ...
  [...]
 
 
> —Carson
 
  tbc. ? :-)
 
  THANX
 

--
Rainer Rutka
University of Konstanz
Communication, Information, Media Centre (KIM)
* KIM Training
* Scientific Computing/bwHPC-C5
* KIM Base Services, KIM Support
Room: V511
78457 Konstanz, Germany
+49 7531 88-5413




Re: Maker-Error when started with IMPI : CORRECTED MAIL : SEE THIS ONE

Carson Holt-2
Try this command —> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec -n 2 echo Hello
Then this command —> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec -n 2 /opt/bwhpc/common/bio/maker/2.31.8_impi/bin/maker -h

If both of these fail, there is the chance that the Intel MPI you are using was compiled on a different architecture than the one you are launching it on. In that case the failure indicates a need to reinstall Intel MPI for that architecture.
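One way to check for that kind of mismatch (a sketch; IMPI_BIN is just shorthand for the intel64/bin directory above):

IMPI_BIN=/opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin
file "$IMPI_BIN/mpiexec"          # ELF binary for this architecture, or a script?
file "$IMPI_BIN/mpiexec.hydra"    # the hydra launcher should be an ELF binary
ldd "$IMPI_BIN/mpiexec.hydra" | grep 'not found'    # libraries that do not resolve on this node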

The following may or may not work if the first two fail:
Then this command —> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec.hydra -n 2 echo Hello
Then this command —> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec.hydra -n 2 /opt/bwhpc/common/bio/maker/2.31.8_impi/bin/maker -h

Also send me this file —> perl/lib/MAKER/ConfigData.pm

Thanks,
Carson


> On Mar 1, 2017, at 5:51 AM, Rainer Rutka <[hidden email]> wrote:
> [...]



Re: Maker-Error when started with IMPI : CORRECTED MAIL : SEE THIS ONE

Rainer Rutka
Hi Carson!

On 02.03.2017 at 01:43, Carson Holt wrote:
> Try this command —> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec -n 2 echo Hello
> Then this command —> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec -n 2 /opt/bwhpc/common/bio/maker/2.31.8_impi/bin/maker -h
Same error(s).

> If both of these fail, there is the chance that the Intel MPI you are using was compiled on a different architecture than the one you are launching it on. In that case the failure indicates a need to reinstall Intel MPI for that architecture.
Yes, they fail.

> The following may or may not work if the first two fail:
> Then this command —> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec.hydra -n 2 echo Hello
WORKS FINE!

> Then this command —> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec.hydra -n 2 /opt/bwhpc/common/bio/maker/2.31.8_impi/bin/maker -h
WORKS!

> Also send me this file —> perl/lib/MAKER/ConfigData.pm
Attached to this mail.

> Thanks,
> Carson

--
Rainer Rutka
University of Konstanz
Communication, Information, Media Centre (KIM)
  * High-Performance-Computing (HPC)
  * KIM-Support and -Base-Services
Room: V511
78457 Konstanz, Germany
+49 7531 88-5413


Attachment: ConfigData.pm (5K)

Re: Maker-Error when started with IMPI : CORRECTED MAIL : SEE THIS ONE

Rainer Rutka
In reply to this post by Carson Holt-2
> The following may or may not work if the first two fail:
> Then this command —> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec.hydra -n 2 echo Hello
> Then this command —> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec.hydra -n 2 /opt/bwhpc/common/bio/maker/2.31.8_impi/bin/maker -h

mpirun (but not mpiexec!) is running, too!

--
Rainer Rutka
University of Konstanz
Communication, Information, Media Centre (KIM)
* High-Performance-Computing (HPC)
* KIM-Support and -Base-Services
Room: V511
78457 Konstanz, Germany
+49 7531 88-5413




Re: Maker-Error when started with IMPI : CORRECTED MAIL : SEE THIS ONE

Carson Holt-2
In reply to this post by Rainer Rutka
This command -> /opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec -n 2 echo Hello

All that command does is start the launcher and print “Hello”. So since it failed, it means the issue is with your MPI installation (i.e. Intel MPI itself). It would have to be reinstalled and recompiled. I would not be surprised if the issues with the other MPI flavors you tried were for the same reason: they were installed for one architecture/compiler/library set, but you are running them on another one, so they always fail.

The second command was an alternate launcher, but it relies on the same underlying libraries as the first one. So if the first one failed, the second one may fail as well (it may just happen later on).


So the issue boils down to one thing —> Your MPI is the issue. You need to reinstall/reconfigure, and once you can get your MPI working, you can move on to trying MAKER.
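To test the MPI stack in isolation from MAKER, a minimal sketch like the following may help (it assumes Intel MPI's mpicc and mpiexec are first on PATH; the file name is made up):

cat > mpi_hello.c <<'EOF'
#include <mpi.h>
#include <stdio.h>
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);               /* bring up the MPI runtime */
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of ranks */
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
mpicc mpi_hello.c -o mpi_hello
mpiexec -n 2 ./mpi_hello    # a healthy stack prints two ranks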

Thanks,
Carson



> On Mar 2, 2017, at 1:41 AM, Rainer Rutka <[hidden email]> wrote:
> [...]



Re: Maker-Error when started with IMPI : CORRECTED MAIL : SEE THIS ONE

Rainer Rutka

Hi Carson.

Again thank you for your response.

But - sorry to say - it is not possible that our MPI is corrupt.
We have approx. 1,500 users working on our bwUniCluster so far, and 95%
of these users use MPI. And: all our other software (see:
cis-hpc.uni-konstanz.de ) is running with our implementations of
IMPI/OMPI without any issues.

:-()


On 02.03.2017 at 18:41, Carson Holt wrote:

> [...]
--
Rainer Rutka
University of Konstanz
Communication, Information, Media Centre (KIM)
* KIM Training
* Scientific Computing/bwHPC-C5
* KIM Base Services, KIM Support
Room: V511
78457 Konstanz, Germany
+49 7531 88-5413




Re: Maker-Error when started with IMPI : CORRECTED MAIL : SEE THIS ONE

Carson Holt-2
I was able to replicate the error like so —>

1. Intel MPI installed on CentOS kernel 6 (MPI works fine).
2. Upgrade to kernel 7 without reinstalling, and Intel MPI reports the same error as reported by the user.
3. After recompiling Intel MPI on kernel 7, the error goes away.
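A quick way to spot that kind of node drift (a sketch; run it on the node where Intel MPI was installed and on a compute node, then compare the output):

uname -r                   # kernel release
ldd --version | head -1    # glibc version the node provides
cat /etc/redhat-release    # distribution release (path assumes a RHEL/CentOS system)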

The proof that there is an issue with your Intel MPI installation is in this command —>
/opt/bwhpc/common/compiler/intel/compxe.2016.4.258/impi/5.1.3.223/intel64/bin/mpiexec -n 2 echo Hello

That command is simply trying to get mpiexec to launch “echo Hello” internally. And it failed. It’s as simple as that.

Thanks,
Carson




> On Mar 6, 2017, at 1:21 AM, Rainer Rutka <[hidden email]> wrote:
> [...]

