Dear all,
An error occurred between two command lines within one of our pipeline components (see the attached file). We have no error message. Moreover, this error does not occur every time we submit our pipeline. Is there a log file that would help us understand this kind of error?

Thanks in advance for your answers.

Kind regards,

------------------------------------------------------------------------------
Keep Your Developer Skills Current with LearnDevNow!
The most comprehensive online learning library for Microsoft developers
is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
Metro Style Apps, more. Free future releases when you subscribe now!
http://p.sf.net/sfu/learndevnow-d2d
_______________________________________________
Ergatis-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/ergatis-users
Laure -
There are a few places to look here. First, there is a log file in the same directory as your pipeline.xml, called pipeline.xml.log, and another one that captures all the standard output, called pipeline.xml.run.out.

Is this pipeline running jobs on a grid? If so, each of those 'groups' listed corresponds to a job scheduled on your grid, and there are output files for each of them in case your job failed. The path to these is defined by the workflow_run_dir setting in your ergatis.ini. For me, that's "/usr/local/scratch/workflow". If you go to that directory, you'll find one folder per pipeline ID. Within that you'll find all the output and error files for each job submission, including those listed in the graphic above.

Joshua

2012/1/17 Laure Devlamynck <[hidden email]> wrote:
> Dear all,
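For readers following the pointers in Joshua's message, here is a minimal Python sketch that gathers the files he mentions. It is only an illustration: the function name candidate_log_files and its parameters are hypothetical, and it assumes the per-pipeline folder under workflow_run_dir is named exactly after the pipeline ID, which may differ on your installation.

```python
from pathlib import Path


def candidate_log_files(pipeline_xml, workflow_run_dir, pipeline_id):
    """Return the log files that exist for one pipeline.

    Hypothetical helper: the layout below is an assumption taken from
    the description in this thread, not from the Ergatis source code.
    """
    pipeline_xml = Path(pipeline_xml)

    # Logs written next to pipeline.xml: pipeline.xml.log and
    # pipeline.xml.run.out.
    candidates = [
        pipeline_xml.with_name(pipeline_xml.name + ".log"),
        pipeline_xml.with_name(pipeline_xml.name + ".run.out"),
    ]
    found = [path for path in candidates if path.is_file()]

    # Per-job output and error files: assumed to live in a folder named
    # after the pipeline ID inside workflow_run_dir.
    job_dir = Path(workflow_run_dir) / str(pipeline_id)
    if job_dir.is_dir():
        found.extend(sorted(p for p in job_dir.rglob("*") if p.is_file()))

    return found
```

Calling it with the path to your pipeline.xml, the workflow_run_dir value from your ergatis.ini, and the pipeline ID returns the list of files worth opening first.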
Joshua,
Thank you for your help. We have found the output and error files. Unfortunately, these files do not help us understand why the pipeline failed.

Kind regards,
Laure

On 18/01/2012 at 16:50, Joshua Orvis wrote:
> Laure -
Laure,

Are you running the pipeline over SGE? If so, you might be able to get SGE job info from 'show group info' and see whether something is failing in SGE.

If the directory set in 'workflow_run_dir' in ergatis.ini didn't provide any clues, you can also check the working directory for the job: 'CWD' in workflow/server-conf/sge_mockserver.conf

Cheers,
Chris

On Fri, 20 Jan 2012, Laure Devlamynck wrote:
Chris,
Thank you for your help. Yes, we are running the pipeline over SGE. The SGE job info from 'show group info' does not help us understand what happened (grid id = 0 and there is no workflow event log). Below is an example of the 'show group info' output we get for a 'failed' group:
In the working directory for the job, we have one subdirectory per group of the component. The contents of the subdirectories are described below:

- Contents of subdirectories corresponding to a 'complete' group:

    total 64
    -rw-rw-rw- 1 ng6 NG6  530 Jan 17 11:22 event.log
    -rw-r--r-- 1 ng6 NG6    5 Jan 17 11:19 pid.log
    -rwxr-xr-x 1 ng6 NG6 7577 Jan 17 11:17 sge_job.sh
    -rw-r--r-- 1 ng6 NG6   52 Jan 17 11:18 sge_submit.out

- Contents of subdirectories corresponding to a 'failed' group:

    total 16
    -rw-r--r-- 1 ng6 NG6    0 Jan 17 11:17 event.log
    -rw-r--r-- 1 ng6 NG6 7577 Jan 17 11:17 sge_job.sh

We also observed that all subdirectories corresponding to a 'complete' group have the permissions 'drwxrwxrwx', whereas those corresponding to a 'failed' group have the permissions 'drwxr-xr-x'.

Kind regards,
Laure

On 20/01/2012 at 18:48, Chris Hemmerich wrote:
A few things come to mind. The first is that the qsub failed for the job: if qsub had succeeded, it would have run the prolog script, which writes to event.log, and sge_submit.out would also be there.
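The reasoning above suggests a quick check over the group subdirectories: a group whose event.log is empty (or absent) and which has no sge_submit.out probably never got past qsub. The sketch below applies that heuristic; the function name groups_never_submitted is hypothetical, and the rule is only an inference from the file listings in this thread, not a documented Ergatis or Workflow behaviour.

```python
from pathlib import Path


def groups_never_submitted(component_dir):
    """Return names of group subdirectories that look unsubmitted.

    Heuristic (an assumption based on this thread): a successful qsub
    leaves a non-empty event.log and an sge_submit.out file behind.
    """
    suspects = []
    for group_dir in sorted(Path(component_dir).iterdir()):
        if not group_dir.is_dir():
            continue

        event_log = group_dir / "event.log"
        submit_out = group_dir / "sge_submit.out"

        # Treat a missing event.log the same as an empty one.
        event_empty = (not event_log.exists()) or event_log.stat().st_size == 0

        if event_empty and not submit_out.exists():
            suspects.append(group_dir.name)
    return suspects
```

Run it on the working directory of the component; the names it returns are the groups to look for in the pipeline log.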
Is your pipeline.xml.log (or equivalent file) non-empty? If so, can you search that file for the string "Submission Failure"? I have a feeling that the submission failed for some reason, and we need to track that down.
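A plain text search is enough for this check. As a minimal sketch, the hypothetical helper below scans a log file for the string and reports the matching line numbers; the function name and the decision to return (line number, text) pairs are illustrative choices, not part of Ergatis.

```python
def find_submission_failures(log_path, needle="Submission Failure"):
    """Return (line_number, line_text) pairs for lines containing needle."""
    hits = []
    # errors="replace" keeps the scan going if the log holds odd bytes.
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for line_number, line in enumerate(handle, start=1):
            if needle in line:
                hits.append((line_number, line.rstrip("\n")))
    return hits
```

An empty result means the string does not occur in the file, in which case the failure has to be tracked down elsewhere.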
From: Laure Devlamynck <[hidden email]>
Reply-To: "[hidden email]" <[hidden email]>
Date: Tue, 24 Jan 2012 18:16:41 +0100
To: Chris Hemmerich <[hidden email]>
Cc: "[hidden email]" <[hidden email]>, Joshua Orvis <[hidden email]>, <[hidden email]>
Subject: Re: [Ergatis-users] Component error log files
Anup,
You are right: the submission failed for each 'failed' group. Below are the 'Submission Failure' entries from pipeline.xml.log. There is one 'Submission Failure' for each 'failed' group. We have rerun the sge_job.sh for a 'failed' group without any problem.

Thank you,
Laure

DEBUG 11:17:53:773 [Thread: HTC ID: 7239862 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:53:802 [Thread: HTC ID: 7239862 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:54:329 [Thread: HTC ID: 7239863 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:54:330 [Thread: HTC ID: 7239863 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:54:875 [Thread: HTC ID: 7239864 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:54:875 [Thread: HTC ID: 7239864 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:55:317 [Thread: HTC ID: 7239865 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:55:318 [Thread: HTC ID: 7239865 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:55:964 [Thread: HTC ID: 7239866 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:55:964 [Thread: HTC ID: 7239866 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:56:340 [Thread: HTC ID: 7239867 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:56:340 [Thread: HTC ID: 7239867 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:56:617 [Thread: HTC ID: 7239868 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:56:617 [Thread: HTC ID: 7239868 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:56:950 [Thread: HTC ID: 7239869 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:56:950 [Thread: HTC ID: 7239869 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:57:163 [Thread: HTC ID: 7239870 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:57:163 [Thread: HTC ID: 7239870 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:57:627 [Thread: HTC ID: 7239871 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:57:628 [Thread: HTC ID: 7239871 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:57:839 [Thread: HTC ID: 7239872 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:57:839 [Thread: HTC ID: 7239872 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:58:355 [Thread: HTC ID: 7239873 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:58:356 [Thread: HTC ID: 7239873 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:58:740 [Thread: HTC ID: 7239874 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:58:741 [Thread: HTC ID: 7239874 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure
DEBUG 11:17:58:960 [Thread: HTC ID: 7239875 runner] SGERunner run:350 Error submitting job to the Grid Submission Failure
WARN 11:17:58:960 [Thread: HTC ID: 7239875 runner] SGERunner run:372 SGE Runner error Submission Failure
java.lang.Exception: Submission Failure

On 24/01/2012 at 21:18, Mahurkar, Anup wrote: