Component error log files

Component error log files

Laure Devlamynck
Dear all,

An error has occurred between two command lines within one of our
pipeline components (see the attached file). We get no error message.
Moreover, this error does not occur every time we submit our pipeline.
Is there a log file that would help us understand this kind of error?

Thanks in advance for your answers.

Kind regards,

------------------------------------------------------------------------------
Keep Your Developer Skills Current with LearnDevNow!
The most comprehensive online learning library for Microsoft developers
is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
Metro Style Apps, more. Free future releases when you subscribe now!
http://p.sf.net/sfu/learndevnow-d2d
_______________________________________________
Ergatis-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/ergatis-users

Attachment: test_work_gpfs_ergatis_bwa_contamination.png (31K)

Re: Component error log files

Joshua Orvis
Laure -

There are a few places to look here.  First, there is a log file in the same directory as your pipeline.xml, called pipeline.xml.log, and another one, pipeline.xml.run.out, that captures all the standard output.

Is this pipeline running jobs on a grid?  If so, each of those 'groups' listed corresponds to a job scheduled on your grid, and there are output files for each of them in case your job failed.  The path to these is defined by the workflow_run_dir setting you have in your ergatis.ini.  For me, that's "/usr/local/scratch/workflow".  You can go to that directory and you'll find one folder per pipeline ID.  Within that you'll find all the output and error files for each job submission, including those listed in the graphic above. 
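
As a rough sketch of that lookup (not something shipped with Ergatis), the Python below reads workflow_run_dir from ergatis.ini and lists the files under one pipeline's folder. The function names are made up for illustration. It assumes ergatis.ini parses as a standard INI file and that the pipeline folder is named after the pipeline ID. Which section holds the setting is not assumed, so every section is checked.

```python
import configparser
from pathlib import Path


def find_workflow_run_dir(ini_path):
    """Return the workflow_run_dir value from an ergatis.ini-style file, or None."""
    # interpolation=None keeps any '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read(ini_path)
    # Assumption: the section name is unknown, so look in all of them.
    for section in parser.sections():
        if parser.has_option(section, "workflow_run_dir"):
            return parser.get(section, "workflow_run_dir")
    return None


def list_job_files(workflow_run_dir, pipeline_id):
    """List every file under <workflow_run_dir>/<pipeline_id>, sorted by path."""
    pipeline_dir = Path(workflow_run_dir) / str(pipeline_id)
    if not pipeline_dir.is_dir():
        return []
    return sorted(p for p in pipeline_dir.rglob("*") if p.is_file())
```

Calling list_job_files(find_workflow_run_dir("ergatis.ini"), pipeline_id) with your own ini path and pipeline ID would then give the output and error files to inspect.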

Joshua

Re: Component error log files

Laure Devlamynck
Joshua,

Thank you for your help.
We have found the output and error files. Unfortunately, these files do not help us understand why the pipeline failed.

kind regards,
Laure

Re: Component error log files

Chris Hemmerich

Laure,

  Are you running the pipeline over SGE? If so you might be able to get SGE
job info from 'show group info' and look to see if something is failing in
SGE.

If the directory set in 'workflow_run_dir' from ergatis.ini didn't provide
any clues, you can also check the working directory for the job:

'CWD' from workflow/server-conf/sge_mockserver.conf

Cheers,
  Chris

Re: Component error log files

Laure Devlamynck
Chris,

Thank you for your help.

Yes, we are running the pipeline over SGE. The SGE job info from 'show group info' does not help us understand what happened (grid id = 0 and there is no workflow event log). Below is an example of the 'show group info' output we get for a 'failed' group:

workflow command id: 7225769
state: failed
start time: Tue Jan 17 11:17:52 2012
end time: Tue Jan 17 11:17:53 2012
duration: 1 sec
grid id: 0
workflow grid id: 7239862
workflow event log: ?
remote wf stderr/sdout :/mnt/wf-working/RunWorkflow.o0,/mnt/wf-working/RunWorkflow.e0
prolog/epilog stderr/stdout /home/guest/staging.*,home/guest/harvesting.*
xml:
/work/ng6/ergatis/workflow/runtime/bwa_contamination_search/67_default/i1/g137/g137.xml.gz


In the job's working directory, we have one subdirectory per group of the component. The contents of these subdirectories are shown below:

- Contents of a subdirectory corresponding to a 'complete' group:
total 64
-rw-rw-rw- 1 ng6 NG6  530 Jan 17 11:22 event.log
-rw-r--r-- 1 ng6 NG6    5 Jan 17 11:19 pid.log
-rwxr-xr-x 1 ng6 NG6 7577 Jan 17 11:17 sge_job.sh
-rw-r--r-- 1 ng6 NG6   52 Jan 17 11:18 sge_submit.out

- Contents of a subdirectory corresponding to a 'failed' group:
total 16
-rw-r--r-- 1 ng6 NG6    0 Jan 17 11:17 event.log
-rw-r--r-- 1 ng6 NG6 7577 Jan 17 11:17 sge_job.sh

We also observed that all subdirectories corresponding to a 'complete' group have the permissions 'drwxrwxrwx', whereas those corresponding to a 'failed' group have the permissions 'drwxr-xr-x'.
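
This comparison could be automated along the following lines. It is only a sketch: the function names are hypothetical, and it assumes the working directory holds one subdirectory per group containing the files named above (event.log, sge_submit.out). A group is flagged when sge_submit.out is absent or event.log is missing or empty, which is the pattern seen in the 'failed' listing.

```python
import stat
from pathlib import Path


def describe_group_dir(group_dir):
    """Summarise the submission evidence found in one group subdirectory."""
    group_dir = Path(group_dir)
    event_log = group_dir / "event.log"
    return {
        "name": group_dir.name,
        # Permission string in ls -l style, e.g. 'drwxr-xr-x'.
        "mode": stat.filemode(group_dir.stat().st_mode),
        "has_submit_out": (group_dir / "sge_submit.out").is_file(),
        # None when event.log does not exist at all.
        "event_log_bytes": event_log.stat().st_size if event_log.is_file() else None,
    }


def suspect_groups(working_dir):
    """Return summaries of groups lacking sge_submit.out or a non-empty event.log."""
    group_dirs = sorted(d for d in Path(working_dir).iterdir() if d.is_dir())
    summaries = [describe_group_dir(d) for d in group_dirs]
    return [s for s in summaries
            if not s["has_submit_out"] or not s["event_log_bytes"]]
```

Printing the "mode" field of each flagged group next to an unflagged one would show whether the permission difference holds across the whole component.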


Kind regards,
Laure

Re: Component error log files

Mahurkar, Anup
A few things come to mind. The first is that the qsub failed for the job: if qsub had succeeded, it would have run the prolog script, which writes to event.log, and sge_submit.out would also be there.

Is your pipeline.xml.log or equivalent file non-empty? If so, can you search for the string "Submission Failure" in that file? I have a feeling that the submission failed for some reason, and we need to track that down.
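
If it helps, that check can be scripted. The sketch below is a hypothetical helper in plain Python, with nothing Ergatis-specific in it: it returns None when the log file is missing or empty, and otherwise the lines that contain "Submission Failure".

```python
from pathlib import Path

MARKER = "Submission Failure"


def submission_failures(log_path):
    """Return log lines containing MARKER, or None if the log is missing or empty."""
    log_path = Path(log_path)
    if not log_path.is_file() or log_path.stat().st_size == 0:
        return None
    # errors="replace" avoids a crash on stray non-UTF-8 bytes in the log.
    with log_path.open(errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if MARKER in line]
```

An empty list means the file has content but no submission failure was logged, which would point away from qsub.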


Re: Component error log files

Laure Devlamynck
Anup,

You are right: the submission failed for each 'failed' group. Below are the 'Submission Failure' entries from pipeline.xml.log; there is one for each 'failed' group.
We have rerun sge_job.sh for a 'failed' group without any problem.

Thank you,
Laure

DEBUG 11:17:53:773 [Thread: HTC ID: 7239862 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:53:802 [Thread: HTC ID: 7239862 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:54:329 [Thread: HTC ID: 7239863 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:54:330 [Thread: HTC ID: 7239863 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:54:875 [Thread: HTC ID: 7239864 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:54:875 [Thread: HTC ID: 7239864 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:55:317 [Thread: HTC ID: 7239865 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:55:318 [Thread: HTC ID: 7239865 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:55:964 [Thread: HTC ID: 7239866 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:55:964 [Thread: HTC ID: 7239866 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:56:340 [Thread: HTC ID: 7239867 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:56:340 [Thread: HTC ID: 7239867 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:56:617 [Thread: HTC ID: 7239868 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:56:617 [Thread: HTC ID: 7239868 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:56:950 [Thread: HTC ID: 7239869 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:56:950 [Thread: HTC ID: 7239869 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:57:163 [Thread: HTC ID: 7239870 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:57:163 [Thread: HTC ID: 7239870 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:57:627 [Thread: HTC ID: 7239871 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:57:628 [Thread: HTC ID: 7239871 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:57:839 [Thread: HTC ID: 7239872 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:57:839 [Thread: HTC ID: 7239872 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:58:355 [Thread: HTC ID: 7239873 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:58:356 [Thread: HTC ID: 7239873 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:58:740 [Thread: HTC ID: 7239874 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:58:741 [Thread: HTC ID: 7239874 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:58:960 [Thread: HTC ID: 7239875 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:58:960 [Thread: HTC ID: 7239875 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
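
To cross-reference these entries with the groups, the HTC IDs can be pulled out of the log. A minimal sketch, assuming every relevant line carries the "[Thread: HTC ID: ... runner]" field exactly as in the excerpt above; the function name is hypothetical.

```python
import re

# Matches the "[Thread: HTC ID: <digits> runner]" field seen in the log lines.
HTC_ID = re.compile(r"\[Thread: HTC ID: (\d+) runner\]")


def failed_htc_ids(log_lines):
    """Return the sorted, de-duplicated HTC IDs of lines reporting a Submission Failure."""
    ids = set()
    for line in log_lines:
        if "Submission Failure" not in line:
            continue
        match = HTC_ID.search(line)
        # Lines such as "java.lang.Exception: ..." carry no ID and are skipped.
        if match:
            ids.add(int(match.group(1)))
    return sorted(ids)
```

On the excerpt above this returns the 14 IDs 7239862 through 7239875. The first one matches the 'workflow grid id' in the 'show group info' output posted earlier in the thread.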




