GenemarkES install/bug problems

GenemarkES install/bug problems

katebush-2
Hello,

I am trying to install GeneMark-ES for 64-bit Linux.  First, does anyone
know how to install the license key?  That was my first error.  I
untarred gm_key_64.tar and copied it to my home directory; is that
right?  When I run a test on the MAKER test contig:

/usr/local/bin/gm_es_bp_linux64_v2.3a/gm_es.pl  dpp.contigs

I get the following errors.  It doesn't seem to be producing a
dna.fa.good.taa file; at least, there is nothing in it.  Any ideas on
what the problem may be would be greatly appreciated.

running hmm2nt.a2
1 files IN
Clusters were defined as:
 0 <= GC% <= 99
99 < GC% <= 99
99 < GC% <= 100

Parsing dna.fa.good.cod

Program complete
----------------
1 sequences found
2 dna.fa.good.ini
zero order for Ini
GC Range: (0,99)
1 sequences of length 12 used from 1
total sequences in dna.fa.good.ini
Generating model...
T    0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00
C    0.00 1.00 0.00 0.00 1.00 1.00 0.00 0.00 0.00 1.00 0.00 1.00
A    0.00 0.00 0.00 1.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00
G    1.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 1.00 0.00
Done
2 lines read from dna.fa.good.ter
1 sequences obtained
1 comment lines
0 lines contained no sequence (or improperly formatted seq)
0 sequences used TAA
0 sequences used TAG
1 sequences used TGA
0 sequences did not begin with a stop codon
All lines accounted for
Done
0 dna.fa.good.taa
zero order for TAA
GC Range: (0,99)
Use of uninitialized value in concatenation (.) or string at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 104.
0 sequences of length  used from 0
total sequences in dna.fa.good.taa
Generating model...
Use of uninitialized value in concatenation (.) or string at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 110.
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117.
T
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117.
C
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117.
A
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117.
G
Done
0 dna.fa.good.tag
zero order for TAG
GC Range: (0,99)
Use of uninitialized value in concatenation (.) or string at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 104.
0 sequences of length  used from 0
total sequences in dna.fa.good.tag
Generating model...
Use of uninitialized value in concatenation (.) or string at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 110.
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117.
T
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117.
C
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117.
A
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117.
G
Done
1 dna.fa.good.tga
zero order for TGA
GC Range: (0,99)
1 sequences of length 12 used from 1
total sequences in dna.fa.good.tga
Generating model...
T    1.00 0.00 0.00 1.00 1.00 0.00 0.00 1.00 1.00 1.00 1.00 0.00
C    0.00 0.00 0.00 0.00 0.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00
A    0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
G    0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00
Done
1 dna.fa.good.don
zero order for DON
GC Range: (0,99)
Use of uninitialized value in concatenation (.) or string at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 104, <IN> line 1.
0 sequences of length  used from 0
total sequences in dna.fa.good.don
Generating model...
Use of uninitialized value in concatenation (.) or string at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 110, <IN> line 1.
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117, <IN> line 1.
T
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117, <IN> line 1.
C
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117, <IN> line 1.
A
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117, <IN> line 1.
G
Done
1 dna.fa.good.acc
zero order for ACC
GC Range: (0,99)
Use of uninitialized value in concatenation (.) or string at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 104, <IN> line 1.
0 sequences of length  used from 0
total sequences in dna.fa.good.acc
Generating model...
Use of uninitialized value in concatenation (.) or string at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 110, <IN> line 1.
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117, <IN> line 1.
T
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117, <IN> line 1.
C
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117, <IN> line 1.
A
Use of uninitialized value in subtraction (-) at /raid0/local/cluster/spatafora/gm_es_bp_linux64_v2.3a/make_nt_freq.mat line 117, <IN> line 1.
G
Done
error reading parameter TERM_TAA_MAT
error in model file org_S1.0mtx
Error on system: prediction step

_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

Re: GenemarkES install/bug problems

Carson Hinton Holt
When you register to download GeneMark, there will be a second link on the download page for the license key.  There are two keys, one 32-bit and one 64-bit, and you must download the one that matches your version of GeneMark.  Once you have the key, unpack it and move it to your home directory, then rename it to .gm_key, i.e. mv  gm_key  .gm_key
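
For anyone who wants to script the key step, here is a minimal Python sketch of the rename described above. The function name install_gm_key and the idea of passing the home directory explicitly are illustrative assumptions, not part of GeneMark; the only details taken from this thread are that the unpacked key ends up in the home directory under the name .gm_key.

```python
import shutil
from pathlib import Path


def install_gm_key(unpacked_key: Path, home: Path) -> Path:
    """Copy an unpacked GeneMark key file to <home>/.gm_key.

    `unpacked_key` is wherever you extracted the key (hypothetical path);
    `home` is normally Path.home().
    """
    if not unpacked_key.is_file():
        raise FileNotFoundError(f"key file not found: {unpacked_key}")
    target = home / ".gm_key"
    # copyfile overwrites an existing .gm_key, e.g. an old or wrong-architecture key
    shutil.copyfile(unpacked_key, target)
    return target
```

A call such as install_gm_key(Path("gm_key"), Path.home()) would then do the same job as the mv command, except that it leaves the original file in place.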

According to the GeneMark documentation, gm_es.pl generally requires at least 10 megabases of sequence to train, so the dpp_contig.fasta file is far too short, and you will probably get an error.  Some background on how GeneMark works: the script gm_es.pl is actually a training script, not the prediction executable.  A separate executable, gmhmme3, is the actual ab initio gene predictor, and it requires that you provide it with a training file.  When you run gm_es.pl, it tries to create a training file for you and then runs gmhmme3 (this is done silently, so if you are unfamiliar with the traditional ‘GeneMark.hmm eukaryotic’ you might not even know it is happening).
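
A quick way to check whether an input has enough sequence before starting a long training run is to total the bases in the FASTA file. The sketch below is only an illustration: the names total_bases and enough_for_training are made up here, and the 10,000,000 threshold is simply the "at least 10 megabases" figure mentioned above, not a value read from GeneMark itself.

```python
from typing import Iterable

# "at least 10 megabases", per the figure quoted in this thread (assumption)
MIN_TRAINING_BASES = 10_000_000


def total_bases(fasta_lines: Iterable[str]) -> int:
    """Count sequence characters in FASTA text, skipping headers and blank lines."""
    total = 0
    for line in fasta_lines:
        line = line.strip()
        if line and not line.startswith(">"):
            total += len(line)
    return total


def enough_for_training(fasta_lines: Iterable[str],
                        minimum: int = MIN_TRAINING_BASES) -> bool:
    """Return True if the input has at least `minimum` bases in total."""
    return total_bases(fasta_lines) >= minimum
```

Because both functions take any iterable of lines, an open file handle can be passed directly, for example with open("dpp.contigs") as fh: print(total_bases(fh)).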

Why use gmhmme3 directly instead of always running gm_es.pl, which will eventually call gmhmme3 for you?  Because gm_es.pl can take up to 24 hours to build a training file, whereas gmhmme3 with an existing training file takes less than 2 minutes to run.  You can build training files either with the self-training script gm_es.pl, as already mentioned, or from pre-existing gene models, which is done with ‘GeneMark.hmm eukaryotic’.

When gm_es.pl finishes, the file you want will be in a directory called mod/ within gm_es.pl’s output.  You pass that training file to MAKER, and MAKER can then run gmhmme3 to produce ab initio gene predictions in whatever way it needs.
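
To locate the training file after a run, something like the following Python sketch may help. It assumes the training file carries a .mod extension (the extension mentioned in this thread for the precomputed files) and that mod/ sits directly under the directory where gm_es.pl was run; the function name find_training_files is hypothetical.

```python
from pathlib import Path
from typing import List


def find_training_files(run_dir: Path) -> List[Path]:
    """Return the *.mod files in <run_dir>/mod, sorted by name.

    Assumes gm_es.pl wrote its output under `run_dir` and that the
    training file ends in .mod (an assumption, not confirmed here).
    """
    mod_dir = run_dir / "mod"
    if not mod_dir.is_dir():
        return []
    return sorted(p for p in mod_dir.glob("*.mod") if p.is_file())
```

An empty list means either that mod/ does not exist yet or that training did not get far enough to write a model, which is what a failed run like the one in the original post would be expected to give.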

If you want a list of precomputed training files, download ‘GeneMark.hmm eukaryotic’.  It comes with a number of .mod files that can be used with the gmhmme3 executable from GeneMark-ES.

I hope that helps.  Just a couple more notes.  GeneMark does not work on Macs, and on some versions of Linux it produces a segmentation fault.  There is really no way around those errors, because GeneMark is distributed as a precompiled binary package and not as source code that can be compiled for your machine.

Thanks,
Carson


Re: GenemarkES install/bug problems

katebush-2
Thanks, very helpful.  I found a site with documentation for GeneMark v.2.4 on the web.  Is there any documentation for GeneMark.hmm eukaryotic or GeneMark-ES anywhere, other than what's included in the tarball?  I don't remember seeing any when I downloaded.

best,

Kathryn



Re: GenemarkES install/bug problems

Carson Hinton Holt
It's really just whatever you can find in the tarball and the documentation they supply on the GeneMark website.  I know it can be limited, but you can also e-mail any questions you have, and I will do my best to answer them.

Carson

