maker predicting only part of a gene

Das, Debojyoti
Hi Carson,

I am working on a non-model reptile species. I tried running maker with est2genome=1 and protein2genome=1, using the following evidence:

1. est=transcriptome.fasta (de novo assembled)
2. protein in fasta format from two related species. 

 
I keep getting gene models that identify multi-exon genes but leave out some of their exons, even though the entire gene is present on the scaffold. We confirmed this by loading the annotation in IGV: some exons in the predicted models are correct, while others are missing.

Since we are not completely confident in our transcriptome assembly, we thought of using cDNA from a closely related species instead:
1. altest="cDNA in fasta format from a related species"
2. protein in fasta format from two related species.

However, when I do this I get the error:
"ERROR: You must provide some form of EST evidence to use est2genome as a predictor."

Interestingly, if I switch est2genome off (setting it to zero) maker starts running. 
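
The behaviour described above suggests that altest alone is not counted as EST evidence for est2genome. Here is a minimal Python sketch of a pre-flight check for that combination, assuming the plain `key=value #comment` layout of a maker_opts.ctl file. The function names and file names are hypothetical, and MAKER's own check may accept other EST-type inputs that this sketch does not model.

```python
def parse_ctl(text):
    """Parse 'key=value #comment' lines from maker_opts.ctl-style text."""
    opts = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        opts[key.strip()] = value.strip()
    return opts


def est2genome_has_evidence(opts):
    """Return False when est2genome=1 but the 'est' option is empty.

    Assumption: only 'est' is treated as same-species EST evidence here,
    which matches the behaviour reported in this thread.
    """
    if opts.get("est2genome", "0") != "1":
        return True
    return bool(opts.get("est"))


# Hypothetical file names, mirroring the two setups in this message.
failing = parse_ctl(
    "est2genome=1\n"
    "protein2genome=1\n"
    "altest=related_cdna.fasta\n"
    "protein=related_proteins.fasta\n"
)
working = parse_ctl("est2genome=1\nest=transcriptome.fasta\n")

print(est2genome_has_evidence(failing))
print(est2genome_has_evidence(working))
```

With these inputs the first configuration is flagged and the second is not, which is consistent with the error only appearing once est is replaced by altest.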

Any suggestions on how to proceed?

Best,
Debojyoti



_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
Re: maker predicting only part of a gene

Carson Holt-2
Neither est2genome=1 nor protein2genome=1 predicts genes. They simply convert exonerate alignments that match ORFs into gene models. That is good enough to train a predictor like SNAP or Augustus, but the results should not be used as the final models. If you review the documentation, you will see that both options should be turned off once you have trained a predictor.
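
As a rough illustration of the switch described above, the following Python sketch rewrites selected options in control-file text for a second run: est2genome and protein2genome are set to 0 and a trained SNAP model is supplied through snaphmm. The helper name, the sample comments, and the file name `my_species.hmm` are hypothetical; which predictor to train and how is covered in the MAKER documentation.

```python
import re


def set_ctl_options(text, overrides):
    """Return ctl text with the given options replaced, keeping comments."""
    out = []
    for line in text.splitlines():
        # Match 'key=value' with an optional trailing '#comment'.
        m = re.match(r"^(\w+)=([^#]*)(#.*)?$", line)
        if m and m.group(1) in overrides:
            comment = m.group(3) or ""
            line = f"{m.group(1)}={overrides[m.group(1)]} {comment}".rstrip()
        out.append(line)
    return "\n".join(out)


# Hypothetical first-pass settings used only to build training models.
first_pass = (
    "est2genome=1 #infer models from ESTs\n"
    "protein2genome=1\n"
    "snaphmm= #SNAP HMM file\n"
)

# Second pass: alignment-derived models off, trained predictor on.
second_pass = set_ctl_options(
    first_pass,
    {"est2genome": "0", "protein2genome": "0", "snaphmm": "my_species.hmm"},
)
print(second_pass)
```

Options not listed in `overrides` are left untouched, so the evidence settings (est, altest, protein) carry over to the second run unchanged.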



—Carson


On Aug 8, 2019, at 1:48 PM, Das, Debojyoti <[hidden email]> wrote:
> [...]